Bug 1917484: Don't adopt after clean failure during deprovisioning #122
Add a set of functions that lend semantic meaning to the various Result values that can be returned - some of which result in identical Result structures being returned to the caller.

This is the first step toward improving the provisioner API to allow the controller more insight into how it should respond to events in the provisioner.

The six possible events are:

* A conflict that should be retried with a delay
* A change to the Host Status that requires it to be written back to the k8s API
* Waiting for an ongoing process
* Successful completion of the current operation
* A failure of the current operation
* A transient error (e.g. network connection failure)

(cherry picked from commit 0e1acfe)
Signed-off-by: Honza Pokorny <honza@redhat.com>
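To make the six events in the commit message above concrete, here is a minimal Go sketch of what such semantic helpers could look like. The `Result` struct, its fields, and every function name below are illustrative assumptions and are not copied from the commit. The point it shows is that two different events (a conflict retry and an ongoing operation) can produce identical `Result` values, which is why named helpers carry meaning that the struct alone does not.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Result is a hypothetical stand-in for the provisioner's result type.
// The field names are assumptions made for this sketch.
type Result struct {
	Dirty        bool          // Host Status changed and must be written back
	RequeueAfter time.Duration // how long to wait before reconciling again
	ErrorMessage string        // non-empty when the operation failed
}

// errConnection is an example transient error used in main.
var errConnection = errors.New("connection refused")

// retryAfterDelay: a conflict that should be retried with a delay.
func retryAfterDelay(delay time.Duration) (Result, error) {
	return Result{Dirty: true, RequeueAfter: delay}, nil
}

// hostUpdated: the Host Status changed and needs to be written back.
func hostUpdated() (Result, error) {
	return Result{Dirty: true}, nil
}

// operationContinuing: waiting for an ongoing process.
func operationContinuing(delay time.Duration) (Result, error) {
	return Result{Dirty: true, RequeueAfter: delay}, nil
}

// operationComplete: the current operation finished successfully.
func operationComplete() (Result, error) {
	return Result{}, nil
}

// operationFailed: the current operation failed.
func operationFailed(message string) (Result, error) {
	return Result{ErrorMessage: message}, nil
}

// transientError: a transient error such as a network connection failure.
func transientError(err error) (Result, error) {
	return Result{}, err
}

func main() {
	a, _ := retryAfterDelay(10 * time.Second)
	b, _ := operationContinuing(10 * time.Second)
	// Different events, identical Result structures.
	fmt.Println(a == b)

	_, err := transientError(errConnection)
	fmt.Println(err)
}
```

In this sketch the caller still receives a plain `(Result, error)` pair, so the helpers only document intent at the call site; giving the controller more insight would be a later step, as the commit message notes.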
During deprovisioning of a Host, if 'deleting' (i.e. deprovisioning) the node succeeds (i.e. it doesn't go to the Error state) but the automated cleaning that follows fails, the only way to recover is to return the node to the manageable state.

Previously, once in the manageable state we would attempt adoption on the node so that we could deprovision again. However, in the course of 'deleting' the node, the image information is cleared from it so it cannot be adopted again. (Adoption continues to be the right thing to do if the node has just been re-registered due to the Ironic database being recreated, and in that case the image information is present since it gets added during the initial registration.)

To work around this, don't attempt to adopt during the Deprovisioning state if the node is manageable and the image data is not present. Handle the manageable state in Deprovision() by declaring the deprovisioning complete.

A node in the manageable state cannot be re-provisioned without first being cleaned - it must go through cleaning to reach the available state before it can be provisioned. Provisioning already handles nodes in the manageable state, as this is how they begin after the initial inspection of the host before the first provisioning (which does the initial cleaning).

(cherry picked from commit ba38688)
Signed-off-by: Honza Pokorny <honza@redhat.com>
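The decision described in the commit message above can be sketched in Go as follows. This is a simplified illustration under stated assumptions: the `node` type, the `HasImageInfo` field, and both function names are hypothetical and do not come from the actual change, and only the manageable state is modelled.

```go
package main

import "fmt"

// provisionState is a hypothetical subset of Ironic node states.
type provisionState string

const (
	stateManageable provisionState = "manageable"
	stateCleanFail  provisionState = "clean failed"
)

// node is an illustrative stand-in for an Ironic node.
type node struct {
	State        provisionState
	HasImageInfo bool // false once 'deleting' has cleared the image data
}

// shouldAdoptWhileDeprovisioning reports whether adoption should be
// attempted while the Host is in the Deprovisioning state. A manageable
// node without image data cannot be adopted, so adoption is skipped.
func shouldAdoptWhileDeprovisioning(n node) bool {
	return n.State == stateManageable && n.HasImageInfo
}

// deprovisionDone reports whether deprovisioning can be declared
// complete. Only the manageable case from the commit message is
// modelled; other states are omitted from this sketch.
func deprovisionDone(n node) bool {
	return n.State == stateManageable
}

func main() {
	// Node returned to manageable after a failed automated clean:
	// image data was cleared during 'deleting'.
	recovered := node{State: stateManageable, HasImageInfo: false}
	fmt.Println(shouldAdoptWhileDeprovisioning(recovered), deprovisionDone(recovered))

	// Node re-registered after the Ironic database was recreated:
	// image data was added again during registration.
	reregistered := node{State: stateManageable, HasImageInfo: true}
	fmt.Println(shouldAdoptWhileDeprovisioning(reregistered))
}
```

Declaring completion from the manageable state is safe under the reasoning in the commit message because a later provisioning attempt must still take the node through cleaning before it becomes available.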
@honza: This pull request references Bugzilla bug 1917484, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andfasano, honza

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
@honza: All pull requests linked via external trackers have merged: Bugzilla bug 1917484 has been moved to the MODIFIED state.
This is meant to supersede #121. It adds the commit "Ironic: Add result functions", without which it cannot be built.